feat(tasks): richer task generation with recursive decomposition (#420) by frankbria · Pull Request #444 · frankbria/codeframe

frankbria · 2026-03-15T04:01:11Z

Summary

Closes #420

- Rich metadata in task generation: Enhanced LLM prompt produces tasks with complexity_score, estimated_hours, uncertainty_level, depends_on, and files_to_modify (fields that existed in the model but were never populated)
- Recursive decomposition engine: classify_task() → decompose_task() → generate_task_tree() breaks PRDs into hierarchical atomic/composite task trees (inspired by tinyagi/fractals)
- Tree-structure fields: Added parent_id, lineage, is_leaf, hierarchical_id to Task model and DB schema
- Status propagation: All children done → parent auto-completes; any child failed → parent fails
- CLI commands: cf tasks generate --recursive [--max-depth N] and cf tasks tree (ASCII hierarchy display)
- Lineage context: Task ancestry included in agent prompts via ContextPackager
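
The status-propagation rule in the summary can be sketched on an in-memory tree. This is only an illustration: the `Node` class, the status strings, and this `propagate_status` signature are assumptions made for the example, not the code in `core/task_tree.py`, which works against the workspace DB.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a persisted task row."""
    id: str
    status: str = "ready"
    parent_id: Optional[str] = None
    children: List[str] = field(default_factory=list)


def propagate_status(nodes: Dict[str, Node], task_id: str) -> None:
    """Walk upward from task_id, recomputing each ancestor from its children."""
    parent_id = nodes[task_id].parent_id
    while parent_id is not None:
        parent = nodes[parent_id]
        child_statuses = [nodes[c].status for c in parent.children]
        if any(s == "failed" for s in child_statuses):
            parent.status = "failed"   # any failed child fails the parent
        elif all(s == "done" for s in child_statuses):
            parent.status = "done"     # every child done completes the parent
        # otherwise the parent keeps its current status
        parent_id = parent.parent_id
```

Calling `propagate_status(nodes, child_id)` after a child changes status is enough to update every ancestor, because the loop follows `parent_id` links up to the root.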

Files Changed

Area	Files	What
Model/Schema	`core/tasks.py`, `core/workspace.py`	4 new Task fields + DB columns
Rich Generation	`core/tasks.py`	Enhanced LLM prompt, dependency resolution
Tree Engine	`core/task_tree.py` (NEW)	classify, decompose, generate_tree, flatten, display, propagate
Context	`core/context_packager.py`	Lineage injection in agent prompts
CLI	`cli/app.py`	--recursive, --max-depth flags, `tree` command
Tests	5 test files	51 new tests

Test plan

- 51 new tests across 5 test files
- All 2357 v2 tests pass (0 regressions)
- Ruff lint clean
- Demo cf tasks generate with rich metadata
- Demo cf tasks tree with hierarchical display
- Demo backward compatibility (generate without --recursive)

Summary by CodeRabbit

New Features
- Recursive task decomposition with configurable max depth and CLI flags
- New CLI command to display task hierarchy as an ASCII tree
- Tasks gain parent/child relationships, hierarchical IDs, lineage, and richer metadata (complexity, estimated hours, uncertainty)
- Lineage context is included in prompts to improve decomposition
Tests
- Extensive tests for recursive decomposition, tree display, schema fields, status propagation, and rich metadata generation

Add parent_id, lineage, is_leaf, and hierarchical_id fields to support hierarchical task decomposition. Includes DB schema, migrations, and CRUD operations with full backward compatibility.
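
An idempotent column migration of this kind can be sketched as follows, assuming the workspace database is SQLite. The column types and defaults are the ones that appear in this PR's `workspace.py` diff; the function name `ensure_tree_columns` and the `TREE_COLUMNS` mapping are hypothetical.

```python
import sqlite3

# Column name -> SQL declaration for the four tree fields.
TREE_COLUMNS = {
    "parent_id": "TEXT",
    "lineage": "TEXT DEFAULT '[]'",
    "is_leaf": "INTEGER DEFAULT 1",
    "hierarchical_id": "TEXT",
}


def ensure_tree_columns(conn: sqlite3.Connection) -> list:
    """Add any missing tree columns to `tasks`; return the names that were added."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(tasks)")}
    added = []
    for name, decl in TREE_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE tasks ADD COLUMN {name} {decl}")
            added.append(name)
    conn.commit()
    return added
```

Running the function a second time adds nothing, which is what keeps older workspaces backward compatible: rows created before the upgrade simply read back the column defaults.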

Update the LLM prompt to request complexity, estimated_hours, uncertainty, depends_on_titles, and files_to_modify. Parse and validate these fields with clamping/defaults. Resolve title-based dependencies to task IDs post-creation.
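
As a rough illustration of "parse and validate with clamping/defaults", the sketch below normalizes two of the numeric fields. The bounds (complexity 1 to 5, estimated_hours at least 0.1) are the ones used in this PR; the `normalize_metadata` name, the try/except guard, and the logging are assumptions added for the example.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


def normalize_metadata(task: Dict[str, Any]) -> Dict[str, Any]:
    """Return clamped complexity/estimated_hours; malformed values become None."""
    out: Dict[str, Any] = {"complexity": None, "estimated_hours": None}

    raw = task.get("complexity")
    if raw is not None:
        try:
            out["complexity"] = max(1, min(5, int(raw)))
        except (TypeError, ValueError):
            # LLM output is untrusted: degrade to None instead of aborting.
            logger.warning("Ignoring malformed complexity: %r", raw)

    raw = task.get("estimated_hours")
    if raw is not None:
        try:
            out["estimated_hours"] = max(0.1, float(raw))
        except (TypeError, ValueError):
            logger.warning("Ignoring malformed estimated_hours: %r", raw)

    return out
```

With this shape, a value such as `"high"` for complexity is dropped with a warning while the rest of the task is still created.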

…ns (#420) Implement Steps 3 and 4 of issue #420: task_tree module with LLM-powered classify/decompose/generate_task_tree for recursive decomposition, plus flatten_task_tree, display_task_tree, and propagate_status for tree management. 25 tests covering all functions.

…ommand (#420 steps 5-6)

- Add --recursive/-r and --max-depth options to `cf tasks generate` for recursive task decomposition via generate_task_tree/flatten_task_tree
- Add `cf tasks tree` command to display task hierarchy as ASCII tree
- Add lineage context to TaskContextPackager.build() prompt when task has lineage data, appearing before verification gates
- 11 new tests: 7 CLI tests + 4 context packager lineage tests
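
The lineage section added to the prompt might look roughly like the sketch below. The review walkthrough states that ancestors are joined with →; the function name `render_lineage_section` and the exact heading markup are assumptions.

```python
from typing import List


def render_lineage_section(lineage: List[str]) -> str:
    """Return a prompt section for the task's ancestry, or '' when there is none."""
    if not lineage:
        return ""  # tasks without lineage leave the prompt unchanged
    return "## Task Lineage\n" + " → ".join(lineage) + "\n"
```

Returning an empty string for tasks without lineage keeps prompts for flat (non-recursive) tasks identical to what they were before.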

coderabbitai · 2026-03-15T04:01:30Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ffff2449-cc0b-4559-8d55-7027fe76e702

📥 Commits

Reviewing files that changed from the base of the PR and between 6dd347a and 1250ded.

📒 Files selected for processing (1)

tests/core/test_rich_task_generation.py

Walkthrough

Adds recursive task decomposition and tree management via a new task_tree module, extends Task schema with tree fields, integrates lineage into prompts, adds CLI flags for recursive generation and a tasks_tree command, and includes schema migrations plus comprehensive tests for tree generation, flattening, display, and propagation.

Changes

Cohort / File(s)	Summary
CLI Commands `codeframe/cli/app.py`	Added `--recursive` (`-r`) and `--max-depth` options to `tasks_generate`; branching to invoke `generate_task_tree` + `flatten_task_tree` when recursive, preserve existing LLM/no-LLM paths otherwise. Added new `tasks_tree` command that calls `display_task_tree` and handles errors.
Task Tree Module `codeframe/core/task_tree.py`	New module implementing `classify_task`, `decompose_task`, `generate_task_tree`, `_parse_subtasks`, `flatten_task_tree`, `display_task_tree`, `_render_node`, and `propagate_status`. Handles lineage context, JSON/markdown parsing, 2–7 subtask clamping/padding, hierarchical IDs, and ASCII tree rendering.
Task Model & Storage `codeframe/core/tasks.py`, `codeframe/core/workspace.py`	Added Task fields: `parent_id`, `lineage`, `is_leaf`, `hierarchical_id`. Updated `create`, `get`, `list_tasks`, and row conversion to persist/return these fields (lineage JSON, is_leaf int). Workspace schema migration adds corresponding columns with idempotent checks. Also enriches generated tasks with complexity, estimated_hours, uncertainty and resolves depends_on_titles → task IDs.
Context Integration `codeframe/core/context_packager.py`	Appends a "Task Lineage" section to packed prompts when a task has lineage, joining ancestor entries with → and placing the section before verification gates.
Tests — Task Tree & Schema `tests/core/test_task_tree.py`, `tests/core/test_task_tree_schema.py`	New comprehensive tests covering classification, decomposition (JSON and markdown-wrapped), recursive generation, flattening to workspace (hierarchical IDs, parent links), ASCII display, status propagation, and schema persistence/backward-compatibility for new fields.
Tests — CLI, Rich Generation, Context `tests/cli/test_tasks_tree_cli.py`, `tests/core/test_rich_task_generation.py`, `tests/core/test_context_packager.py`	CLI tests for `--recursive` and `--max-depth`, and paths for recursive vs non-recursive generation. Rich generation tests for complexity/estimated_hours/uncertainty/files/dependencies handling and edge cases. Context packager tests assert lineage section presence and ordering.

Sequence Diagrams

sequenceDiagram
    participant CLI as "tasks_generate CLI"
    participant TreeGen as "generate_task_tree"
    participant Classify as "classify_task"
    participant Provider as "LLM Provider"
    participant Decompose as "decompose_task"
    participant Flatten as "flatten_task_tree"
    participant Workspace as "Workspace DB"

    CLI->>TreeGen: Invoke with PRD description, lineage, max_depth
    TreeGen->>Classify: Ask: atomic or composite? (includes lineage)
    Classify->>Provider: LLM prompt (with lineage)
    Provider-->>Classify: "atomic" / "composite"
    alt atomic or depth limit reached
        Classify-->>TreeGen: Return leaf node
    else composite
        TreeGen->>Decompose: Request 2–7 subtasks (with lineage)
        Decompose->>Provider: LLM prompt (with lineage)
        Provider-->>Decompose: Subtasks (JSON / markdown)
        Decompose-->>TreeGen: Parsed subtasks
        loop per subtask
            TreeGen->>TreeGen: Recurse (lineage + parent, depth+1)
        end
    end
    TreeGen->>Flatten: Return nested tree to flatten
    Flatten->>Workspace: Create task rows (parent_id, lineage, hierarchical_id)
    Workspace-->>Flatten: Persisted task IDs
    Flatten-->>CLI: Completed
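
The flow in the diagram above can be sketched in plain Python. This is a simplified illustration, not the module's code: `classify` and `decompose` are caller-supplied stand-ins for the LLM calls, `TaskNode` and `parse_subtasks` are hypothetical names, the default `max_depth` is arbitrary, and the sketch only truncates to 7 subtasks (it does not pad up to 2).

```python
import json
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional

FENCE = "`" * 3  # markdown code fence marker
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


@dataclass
class TaskNode:
    title: str
    lineage: List[str] = field(default_factory=list)
    is_leaf: bool = True
    children: List["TaskNode"] = field(default_factory=list)


def parse_subtasks(raw: str) -> List[dict]:
    """Parse a JSON array of subtasks, tolerating a markdown-wrapped response."""
    match = _FENCED.search(raw)
    payload = match.group(1) if match else raw
    try:
        items = json.loads(payload)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    return [i for i in items if isinstance(i, dict) and i.get("title")][:7]


def generate_task_tree(
    title: str,
    classify: Callable[[str, List[str]], str],
    decompose: Callable[[str, List[str]], str],
    lineage: Optional[List[str]] = None,
    depth: int = 0,
    max_depth: int = 3,
) -> TaskNode:
    lineage = lineage or []
    node = TaskNode(title=title, lineage=lineage)
    if depth >= max_depth or classify(title, lineage) == "atomic":
        return node  # leaf: atomic task or depth limit reached
    subtasks = parse_subtasks(decompose(title, lineage))
    if not subtasks:
        return node  # nothing usable came back; keep the node as a leaf
    node.is_leaf = False
    for sub in subtasks:
        child = generate_task_tree(
            sub["title"], classify, decompose, lineage + [title], depth + 1, max_depth
        )
        node.children.append(child)
    return node
```

Passing the two LLM calls in as functions keeps the recursion testable with simple stubs, which is similar in spirit to the mocked provider used in the PR's tests.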

sequenceDiagram
    participant CLI as "tasks_tree CLI"
    participant Workspace as "Workspace DB"
    participant Display as "display_task_tree"
    participant Render as "_render_node"
    participant Console as "Console Output"

    CLI->>Display: Request tree for workspace
    Display->>Workspace: Load all tasks
    Workspace-->>Display: Return task list
    Display->>Display: Build parent→children map, sort roots
    loop per root
        Display->>Render: Render node line (status, kind, hierarchical id)
        Render-->>Console: Emit line
        loop per child
            Render->>Render: Recurse with indent
            Render-->>Console: Emit child line
        end
    end
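
The display flow in the second diagram (build a parent→children map, sort, render recursively) can be sketched as below. The `Row` class and `display_tree` name are hypothetical, the sketch omits the status and kind columns, and it assumes every `hierarchical_id` is purely numeric (e.g. `1.2.3`).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Row:
    """Hypothetical minimal view of a task row."""
    id: str
    title: str
    hierarchical_id: str
    parent_id: Optional[str] = None


def display_tree(rows: List[Row]) -> List[str]:
    """Return the ASCII tree as a list of lines, roots first."""
    children: Dict[Optional[str], List[Row]] = defaultdict(list)
    for row in rows:
        children[row.parent_id].append(row)

    def key(r: Row):
        # Numeric ordering so that 1.10 sorts after 1.2.
        return tuple(int(p) for p in r.hierarchical_id.split("."))

    lines: List[str] = []

    def render(row: Row, prefix: str, is_last: bool, is_root: bool) -> None:
        connector = "" if is_root else ("+-- " if is_last else "|-- ")
        lines.append(f"{prefix}{connector}{row.hierarchical_id} {row.title}")
        kids = sorted(children.get(row.id, []), key=key)
        child_prefix = prefix if is_root else prefix + ("    " if is_last else "|   ")
        for i, kid in enumerate(kids):
            render(kid, child_prefix, i == len(kids) - 1, False)

    for root in sorted(children.get(None, []), key=key):
        render(root, "", True, True)
    return lines
```

Returning lines instead of printing directly makes the renderer easy to assert against in tests; a CLI command can simply print each line.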

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat(core): enhance TaskContextPackager for external agents #432 — Modifies context_packager.py; overlaps with this PR’s insertion of the "Task Lineage" prompt section.
Return typed objects from database queries #135 — Changes Task model / DB row-to-object logic; relates to schema and Task field additions in this PR.

Poem

🐰
I nibble ideas into branch and root,
Tasks unfold like carrots, crisp and cute,
Lineage guides each hop and tiny part,
Leaves turn done when children play their part,
A recursive hop — the rabbit cheers, heart-smart!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main changes: richer task generation (metadata fields) and recursive decomposition as a new feature.
Linked Issues check	✅ Passed	The PR implementation fully addresses all coding requirements from issue `#420`: recursive decomposition engine, task tree structure, rich metadata, CLI commands, status propagation, lineage tracking, and comprehensive tests.
Out of Scope Changes check	✅ Passed	All changes align with the stated objectives; no out-of-scope modifications detected beyond the recursive decomposition, task tree, and associated infrastructure.
Docstring Coverage	✅ Passed	Docstring coverage is 98.73% which is sufficient. The required threshold is 80.00%.


coderabbitai

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

codeframe/cli/app.py (1)

1597-1616: ⚠️ Potential issue | 🟠 Major

Handle --recursive and --no-llm as mutually exclusive.

Line 1597 skips LLM validation when --no-llm is set, but Line 1601 still executes the LLM-only recursive path. As a result, cf tasks generate --recursive --no-llm behaves contradictorily and can fail with a confusing provider error.

💡 Suggested fix

-        if not no_llm:
+        if recursive and no_llm:
+            console.print("[red]Error:[/red] --recursive cannot be combined with --no-llm")
+            raise typer.Exit(1)
+
+        if not no_llm:
             from codeframe.cli.validators import require_anthropic_api_key
             require_anthropic_api_key()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@codeframe/cli/app.py` around lines 1597 - 1616, The recursive branch
currently runs LLM-only logic even when --no-llm is passed; make the flags
mutually exclusive by validating them before using LLM functions: add a check
that if recursive and no_llm are both true, print an error (or raise/exit) and
abort, or change the recursive condition to "if recursive and not no_llm" so the
code only calls get_provider(), generate_task_tree, and flatten_task_tree when
an LLM is allowed; also ensure require_anthropic_api_key() is only called when
an LLM will actually be used (i.e., when not no_llm).

🧹 Nitpick comments (1)

tests/cli/test_tasks_tree_cli.py (1)

71-77: Assert result.exit_code in the happy-path CLI tests.

Right now both tests can pass even if the command crashes after calling the mocked function. An exit-code assertion will make the behavior check meaningful and prevent false positives.

Proposed fix

             result = runner.invoke(
                 app,
                 ["tasks", "generate", "--recursive", "-w", str(tmp_path)],
             )

+            assert result.exit_code == 0, result.output
             mock_gen_tree.assert_called_once()
             mock_flatten.assert_called_once()
@@
             result = runner.invoke(
                 app,
                 ["tasks", "generate", "-w", str(tmp_path)],
             )

+            assert result.exit_code == 0, result.output
             mock_gen.assert_called_once()

Also applies to: 104-109

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/cli/test_tasks_tree_cli.py` around lines 71 - 77, Add an explicit
exit-code assertion after the CLI invocation to ensure the command completed
successfully: after the runner.invoke(app, ["tasks", "generate", "--recursive",
"-w", str(tmp_path)]) call, assert result.exit_code == 0 (do the same for the
other similar test later in the file that also calls runner.invoke and asserts
mock_gen_tree/mock_flatten). This ensures the test fails if the CLI crashes even
when the mocks were called.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@codeframe/core/task_tree.py`:
- Around line 26-29: The recursive decomposition currently only requests and
persists 'title'/'description' (DECOMPOSE_SYSTEM_PROMPT and the code paths that
parse/save decomposed children), losing richer task metadata; update the
decomposition prompt (DECOMPOSE_SYSTEM_PROMPT) to require the full task schema
(complexity, estimate, dependencies, files, acceptance_criteria, etc.) and
modify the recursive parse/save logic (the functions that parse the
decomposition response and the code that creates/persists child nodes in
TaskTree) to accept and persist all those fields instead of only
title/description/tree fields so downstream features receive full metadata.
- Around line 396-400: The parent status update in propagate_status mutates
state via task_module.update_status(parent.id) but doesn't emit the required
audit event; after the successful change where new_status != parent.status, call
the core events API (import core.events) to emit a status-transition event
(e.g., core.events.emit_status_transition or the project's equivalent) with the
task id, old status (parent.status), new_status and any workspace/context so all
auto-propagated transitions are recorded; ensure the emit happens only when
update_status is invoked and include the same identifiers used in
propagate_status and task_module.update_status.
- Around line 305-308: The sort key currently treats hierarchical_id as a string
so "1.10" sorts before "1.2"; change the key used when sorting children_map
entries so it converts t.hierarchical_id into a tuple of integers (e.g.,
tuple(map(int, t.hierarchical_id.split('.')))) and fall back to t.title or a
safe tuple when hierarchical_id is None or contains non-numeric parts; update
the sort call on children_map (the lambda at the children_map[pid].sort(...)
line) to parse and return the numeric tuple for proper numeric hierarchical
ordering.

In `@codeframe/core/tasks.py`:
- Around line 559-561: The task creation currently only persists
complexity_score, estimated_hours, and uncertainty_level from task_data and
flattens/loses other structured metadata; update the Task construction (where
complexity_score, estimated_hours, uncertainty_level are set) to also assign
files_to_modify=task_data.get("files_to_modify"),
files_to_create=task_data.get("files_to_create"), and
acceptance_criteria=task_data.get("acceptance_criteria") instead of folding them
into description, and apply the same fix to the other parsing/creation sites
referenced (the blocks around lines 592-599 and 643-655) so the structured
metadata survives parsing and downstream callers can consume those fields.
- Around line 631-637: The conversions for task["complexity"] and
task["estimated_hours"] are unsafe for untrusted LLM output; wrap the int() and
float() conversions in try/except (catch ValueError and TypeError), clamp valid
parsed values as before (complexity -> max(1,min(5,int(...))), estimated_hours
-> max(0.1,float(...))) and on parse failure log a warning and set the field to
None or a safe default so generate_from_prd() does not treat a single malformed
value as a hard failure; update the code around the task dict handling where
complexity and estimated_hours are parsed to implement this defensive parsing
and logging.
- Around line 565-571: The current resolution loop silently drops dependency
titles that don't exactly match and never refreshes created_tasks with the
updated objects returned by update_depends_on; change the logic in the block
that builds title_to_id and iterates over tasks_data/created_tasks so that it
(1) detects and surfaces missing or duplicate titles from
tasks_data.get("depends_on_titles") (raise an exception or return a clear
error/log instead of silently skipping) and (2) assigns the result of
update_depends_on(workspace, task.id, dep_ids) back into created_tasks (i.e.,
replace the task in created_tasks with the returned Task) so callers receive
up-to-date depends_on values; retain references to title_to_id, created_tasks,
update_depends_on, tasks_data, and depends_on_titles when making these changes.

In `@codeframe/core/workspace.py`:
- Around line 424-435: The upgrade path in _ensure_schema_upgrades() must add
the github_issue_number column so tasks.py's unconditional SELECT won't fail;
add a check like the existing ones for "github_issue_number" on the tasks table
and run an ALTER TABLE tasks ADD COLUMN github_issue_number TEXT (or INTEGER)
DEFAULT NULL followed by conn.commit(); update the block that checks
task_columns (in codeframe/core/workspace.py, the function
_ensure_schema_upgrades) to include this ALTER and ensure the schema is present
before any reads from codeframe/core/tasks.py.

In `@tests/core/test_context_packager.py`:
- Around line 465-542: The module tests/core/test_context_packager.py contains
new v2 lineage tests (e.g. TestLineageContext and its methods like
test_context_packager_includes_lineage) but lacks the module-level v2 marker;
add an import for pytest and set pytestmark = pytest.mark.v2 at the top of the
file so all tests in this module are marked v2 per the repo testing rules.

In `@tests/core/test_rich_task_generation.py`:
- Around line 10-13: Remove the unused import PrdRecord from the test file to
satisfy the linter: edit the imports at the top of
tests/core/test_rich_task_generation.py and delete the reference to PrdRecord so
only used symbols (MockProvider, tasks, create_or_load_workspace) remain; ensure
no other references to PrdRecord exist in the file before committing.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0c296c73-8567-4445-9b9a-c4367ed4656a

📥 Commits

Reviewing files that changed from the base of the PR and between cf3fa89 and 6dd347a.

📒 Files selected for processing (10)

codeframe/cli/app.py
codeframe/core/context_packager.py
codeframe/core/task_tree.py
codeframe/core/tasks.py
codeframe/core/workspace.py
tests/cli/test_tasks_tree_cli.py
tests/core/test_context_packager.py
tests/core/test_rich_task_generation.py
tests/core/test_task_tree.py
tests/core/test_task_tree_schema.py

coderabbitai · 2026-03-15T04:09:51Z

+DECOMPOSE_SYSTEM_PROMPT = (
+    "Break this task into 2-7 concrete subtasks. Each should be actionable and "
+    "testable. Return a JSON array of objects with 'title' and 'description' fields."
+)


⚠️ Potential issue | 🟠 Major

Recursive flow currently drops rich task metadata.

The recursive decomposition contract only captures title/description, and Lines 254-263 persist only those plus the tree fields. Recursively generated tasks therefore miss the richer generation data (e.g., complexity/estimate/dependencies/files/acceptance criteria), so downstream features won’t be populated on this path.

Also applies to: 121-160, 254-263


coderabbitai · 2026-03-15T04:09:51Z

+    for pid in children_map:
+        children_map[pid].sort(
+            key=lambda t: t.hierarchical_id or t.title
+        )


⚠️ Potential issue | 🟡 Minor

Sort key for hierarchical_id is lexicographic, not numeric.

Lines 305-308 will order 1.10 before 1.2. This breaks tree display ordering once a node has ten or more children.

💡 Suggested fix

+def _hierarchical_sort_key(task) -> tuple:
+    hid = task.hierarchical_id
+    if hid:
+        parts = []
+        for p in str(hid).split("."):
+            parts.append(int(p) if p.isdigit() else p)
+        return (0, parts)
+    return (1, task.title.lower())
+
     # Sort children by hierarchical_id if available, else by title
     for pid in children_map:
-        children_map[pid].sort(
-            key=lambda t: t.hierarchical_id or t.title
-        )
+        children_map[pid].sort(key=_hierarchical_sort_key)
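
One caveat worth checking with a key like this: if sibling IDs ever mix numeric and non-numeric parts at the same position (for example `1.2` next to `1.b`), Python 3 raises `TypeError` when it compares an `int` with a `str`. A variant that tags each part avoids that. This is a sketch only; `hierarchical_sort_key` and `FakeTask` are hypothetical names, and whether non-numeric parts can occur in practice is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FakeTask:
    """Hypothetical stand-in exposing only the attributes the key reads."""
    title: str
    hierarchical_id: Optional[str] = None


def hierarchical_sort_key(task) -> Tuple:
    """Numeric ordering for IDs like 1.2 < 1.10; tasks without an ID sort last by title."""
    hid = task.hierarchical_id
    if not hid:
        return (1, (), task.title.lower())
    parts = tuple(
        # Tag each part so ints are only compared with ints and strs with strs.
        (0, int(p), "") if (p.isascii() and p.isdigit()) else (1, 0, p)
        for p in str(hid).split(".")
    )
    return (0, parts, "")
```

Numeric parts sort before non-numeric ones at the same position, and tasks with no `hierarchical_id` fall back to a case-insensitive title ordering after all numbered tasks.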



coderabbitai · 2026-03-15T04:09:51Z

+    if new_status and new_status != parent.status:
+        task_module.update_status(workspace, parent.id, new_status)
+
+    # Recursively propagate upward
+    propagate_status(workspace, parent.id)


⚠️ Potential issue | 🟠 Major

Emit events when propagated parent status changes.

Line 397 mutates parent task state but does not emit a status-transition event, so audit/observability is incomplete for auto-propagated transitions.

💡 Suggested fix

+from codeframe.core.events import emit_for_workspace, EventType
@@
     if new_status and new_status != parent.status:
+        old_status = parent.status
         task_module.update_status(workspace, parent.id, new_status)
+        emit_for_workspace(
+            workspace,
+            EventType.TASK_STATUS_CHANGED,
+            {
+                "task_id": parent.id,
+                "old_status": old_status.value,
+                "new_status": new_status.value,
+            },
+            print_event=False,
+        )

As per coding guidelines codeframe/core/**/*.py: “All core modules must emit events for state transitions via core/events.py for audit and observability”.



coderabbitai · 2026-03-15T04:09:51Z

+            complexity_score=task_data.get("complexity"),
+            estimated_hours=task_data.get("estimated_hours"),
+            uncertainty_level=task_data.get("uncertainty"),


⚠️ Potential issue | 🟠 Major

The richer metadata contract is still lossy.

This flow only persists complexity/hours/uncertainty. files_to_modify gets flattened into description, and files_to_create / acceptance_criteria never survive parsing at all, so downstream callers still cannot consume the structured metadata this PR is supposed to add.

Also applies to: 592-599, 643-655


coderabbitai · 2026-03-15T04:09:51Z

+    # Resolve title-based dependencies to task IDs
+    title_to_id = {t.title: t.id for t in created_tasks}
+    for task_data, task in zip(tasks_data, created_tasks):
+        dep_titles = task_data.get("depends_on_titles", [])
+        dep_ids = [title_to_id[t] for t in dep_titles if t in title_to_id]
+        if dep_ids:
+            update_depends_on(workspace, task.id, dep_ids)


⚠️ Potential issue | 🟠 Major

Dependency resolution should not silently weaken the graph or return stale tasks.

A typo, truncation mismatch, or duplicate title just disappears here because only exact matches survive. Also, update_depends_on() returns the updated Task, but created_tasks is never refreshed, so callers get stale depends_on values back even when resolution succeeds.

Proposed fix

-    title_to_id = {t.title: t.id for t in created_tasks}
-    for task_data, task in zip(tasks_data, created_tasks):
+    title_to_id = {t.title: t.id for t in created_tasks}
+    if len(title_to_id) != len(created_tasks):
+        raise ValueError("Duplicate task titles prevent dependency resolution")
+
+    for i, (task_data, task) in enumerate(zip(tasks_data, created_tasks)):
         dep_titles = task_data.get("depends_on_titles", [])
-        dep_ids = [title_to_id[t] for t in dep_titles if t in title_to_id]
+        unresolved = [title for title in dep_titles if title not in title_to_id]
+        if unresolved:
+            raise ValueError(
+                f"Unresolved dependencies for {task.title}: {', '.join(unresolved)}"
+            )
+
+        dep_ids = [title_to_id[title] for title in dep_titles]
         if dep_ids:
-            update_depends_on(workspace, task.id, dep_ids)
+            created_tasks[i] = update_depends_on(workspace, task.id, dep_ids)
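
For experimenting with the strict behaviour outside the codebase, a self-contained version of the same idea might look like this. The `resolve_dependencies` name and its plain-data inputs (a list of `(title, id)` pairs instead of Task objects) are assumptions for the example; the real code would go through `update_depends_on`.

```python
from collections import Counter
from typing import Dict, List, Tuple


def resolve_dependencies(
    tasks_data: List[dict], created: List[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Map each created task ID to its dependency IDs, failing loudly on bad titles.

    `created` holds (title, task_id) pairs in the same order as `tasks_data`.
    """
    counts = Counter(title for title, _ in created)
    duplicates = sorted(t for t, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"Duplicate task titles: {', '.join(duplicates)}")

    title_to_id = dict(created)
    resolved: Dict[str, List[str]] = {}
    for data, (title, task_id) in zip(tasks_data, created):
        dep_titles = data.get("depends_on_titles") or []
        missing = [t for t in dep_titles if t not in title_to_id]
        if missing:
            raise ValueError(
                f"Unresolved dependencies for {title}: {', '.join(missing)}"
            )
        resolved[task_id] = [title_to_id[t] for t in dep_titles]
    return resolved
```

Whether to raise or merely log on an unresolved title is a design choice; raising makes a typo in LLM output visible immediately, at the cost of failing the whole generation run.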


coderabbitai · 2026-03-15T04:09:51Z

+        complexity = task.get("complexity")
+        if complexity is not None:
+            complexity = max(1, min(5, int(complexity)))
+
+        estimated_hours = task.get("estimated_hours")
+        if estimated_hours is not None:
+            estimated_hours = max(0.1, float(estimated_hours))


⚠️ Potential issue | 🟠 Major

Handle malformed numeric metadata defensively.

These values come from untrusted LLM output. int() / float() can raise here, and generate_from_prd() currently treats ValueError as a hard config failure, so one bad "complexity" or "estimated_hours" field can abort the entire generation path instead of degrading gracefully.

Proposed fix

-        complexity = task.get("complexity")
-        if complexity is not None:
-            complexity = max(1, min(5, int(complexity)))
+        complexity = task.get("complexity")
+        try:
+            complexity = (
+                max(1, min(5, int(complexity)))
+                if complexity is not None
+                else None
+            )
+        except (TypeError, ValueError):
+            complexity = None
 
-        estimated_hours = task.get("estimated_hours")
-        if estimated_hours is not None:
-            estimated_hours = max(0.1, float(estimated_hours))
+        estimated_hours = task.get("estimated_hours")
+        try:
+            estimated_hours = (
+                max(0.1, float(estimated_hours))
+                if estimated_hours is not None
+                else None
+            )
+        except (TypeError, ValueError):
+            estimated_hours = None

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@codeframe/core/tasks.py` around lines 631 - 637, The conversions for task["complexity"] and task["estimated_hours"] are unsafe for untrusted LLM output; wrap the int() and float() conversions in try/except (catch ValueError and TypeError), clamp valid parsed values as before (complexity -> max(1,min(5,int(...))), estimated_hours -> max(0.1,float(...))) and on parse failure log a warning and set the field to None or a safe default so generate_from_prd() does not treat a single malformed value as a hard failure; update the code around the task dict handling where complexity and estimated_hours are parsed to implement this defensive parsing and logging.
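
The parse, clamp, and warn pattern described above could be factored into one helper so both fields share it. This is a sketch under assumptions: `parse_clamped` is a hypothetical name, not a function from the PR, and the choice to fall back to None rather than a default is only one of the options the comment mentions.

```python
import logging

logger = logging.getLogger(__name__)


def parse_clamped(value, cast, low, high=None, field="value"):
    """Cast untrusted input and clamp it; return None (with a warning) on failure.

    Hypothetical helper: `cast` is int or float, `low`/`high` are the bounds.
    """
    if value is None:
        return None
    try:
        number = cast(value)
    except (TypeError, ValueError, OverflowError):
        # Malformed LLM output degrades to "unknown" instead of aborting generation.
        logger.warning("Ignoring malformed %s: %r", field, value)
        return None
    number = max(low, number)
    return min(high, number) if high is not None else number


complexity = parse_clamped("9", int, 1, 5, field="complexity")
estimated_hours = parse_clamped("soon", float, 0.1, field="estimated_hours")
```

Here `complexity` is clamped to the top of the 1 to 5 range, while the unparseable `estimated_hours` becomes None and a warning is logged.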

coderabbitai · 2026-03-15T04:09:51Z

+        if "parent_id" not in task_columns:
+            cursor.execute("ALTER TABLE tasks ADD COLUMN parent_id TEXT")
+            conn.commit()
+        if "lineage" not in task_columns:
+            cursor.execute("ALTER TABLE tasks ADD COLUMN lineage TEXT DEFAULT '[]'")
+            conn.commit()
+        if "is_leaf" not in task_columns:
+            cursor.execute("ALTER TABLE tasks ADD COLUMN is_leaf INTEGER DEFAULT 1")
+            conn.commit()
+        if "hierarchical_id" not in task_columns:
+            cursor.execute("ALTER TABLE tasks ADD COLUMN hierarchical_id TEXT")
+            conn.commit()


⚠️ Potential issue | 🔴 Critical

Backfill github_issue_number in the upgrade path before shipping this schema change.

codeframe/core/tasks.py now selects github_issue_number unconditionally, but _ensure_schema_upgrades() still never adds that column for pre-existing workspaces. On an older DB, the first task read will fail with sqlite3.OperationalError: no such column: github_issue_number.

Proposed fix

         if "uncertainty_level" not in task_columns:
             cursor.execute("ALTER TABLE tasks ADD COLUMN uncertainty_level TEXT")
             conn.commit()
+        if "github_issue_number" not in task_columns:
+            cursor.execute("ALTER TABLE tasks ADD COLUMN github_issue_number INTEGER")
+            conn.commit()
         if "parent_id" not in task_columns:
             cursor.execute("ALTER TABLE tasks ADD COLUMN parent_id TEXT")
             conn.commit()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@codeframe/core/workspace.py` around lines 424 - 435, The upgrade path in _ensure_schema_upgrades() must add the github_issue_number column so tasks.py's unconditional SELECT won't fail; add a check like the existing ones for "github_issue_number" on the tasks table and run an ALTER TABLE tasks ADD COLUMN github_issue_number TEXT (or INTEGER) DEFAULT NULL followed by conn.commit(); update the block that checks task_columns (in codeframe/core/workspace.py, the function _ensure_schema_upgrades) to include this ALTER and ensure the schema is present before any reads from codeframe/core/tasks.py.
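
The check-then-ALTER pattern used by the upgrade path can be exercised against an in-memory database. The sketch below is not the project's `_ensure_schema_upgrades()`; `ensure_columns` is a hypothetical, table-driven variant, and the two-column starting schema merely simulates an older workspace.

```python
import sqlite3


def ensure_columns(conn: sqlite3.Connection, table: str, columns: dict[str, str]) -> list[str]:
    """Add any missing columns to `table` and return the names that were added.

    `columns` maps a column name to its type/default clause. Names are
    interpolated into SQL, so they must come from trusted code, never user input.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, decl in columns.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {decl}")
            added.append(name)
    conn.commit()
    return added


wanted = {
    "github_issue_number": "INTEGER",
    "parent_id": "TEXT",
    "is_leaf": "INTEGER DEFAULT 1",
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, title TEXT)")  # simulated old schema
first_run = ensure_columns(conn, "tasks", wanted)   # adds all three columns
second_run = ensure_columns(conn, "tasks", wanted)  # no-op: the upgrade is idempotent
```

Keeping the wanted columns in one mapping makes it harder for a column that is selected unconditionally elsewhere to be left out of the upgrade path.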

coderabbitai · 2026-03-15T04:09:51Z

+class TestLineageContext:
+    """Tests for lineage inclusion in build() prompt."""
+
+    def test_context_packager_includes_lineage(self, mock_workspace):
+        """Task with lineage should have 'Task Lineage' section in prompt."""
+        ctx = MagicMock(spec=TaskContext)
+        ctx.task = MagicMock()
+        ctx.task.lineage = ["Build app", "Authentication module"]
+        ctx.to_prompt_context.return_value = (
+            "## Task\n**Title:** Implement JWT\n**Description:** Add tokens\n"
+        )
+        ctx.relevant_files = []
+
+        with patch("codeframe.core.context_packager.ContextLoader") as MockLoader:
+            MockLoader.return_value.load.return_value = ctx
+
+            packager = TaskContextPackager(mock_workspace)
+            result = packager.build("task-1")
+
+            assert "Task Lineage" in result.prompt
+            assert "Build app" in result.prompt
+            assert "Authentication module" in result.prompt
+
+    def test_context_packager_no_lineage(self, mock_workspace):
+        """Task without lineage should not have 'Task Lineage' section."""
+        ctx = MagicMock(spec=TaskContext)
+        ctx.task = MagicMock()
+        ctx.task.lineage = []
+        ctx.to_prompt_context.return_value = (
+            "## Task\n**Title:** Simple task\n**Description:** Do it\n"
+        )
+        ctx.relevant_files = []
+
+        with patch("codeframe.core.context_packager.ContextLoader") as MockLoader:
+            MockLoader.return_value.load.return_value = ctx
+
+            packager = TaskContextPackager(mock_workspace)
+            result = packager.build("task-1")
+
+            assert "Task Lineage" not in result.prompt
+
+    def test_context_packager_lineage_missing_attribute(self, mock_workspace):
+        """Task without lineage attribute should not have 'Task Lineage' section."""
+        ctx = MagicMock(spec=TaskContext)
+        ctx.task = MagicMock(spec=["title", "description", "id"])
+        # No lineage attribute on task
+        ctx.to_prompt_context.return_value = (
+            "## Task\n**Title:** Old task\n**Description:** Legacy\n"
+        )
+        ctx.relevant_files = []
+
+        with patch("codeframe.core.context_packager.ContextLoader") as MockLoader:
+            MockLoader.return_value.load.return_value = ctx
+
+            packager = TaskContextPackager(mock_workspace)
+            result = packager.build("task-1")
+
+            assert "Task Lineage" not in result.prompt
+
+    def test_lineage_appears_before_gates(self, mock_workspace):
+        """Lineage section should appear before Verification Gates."""
+        ctx = MagicMock(spec=TaskContext)
+        ctx.task = MagicMock()
+        ctx.task.lineage = ["Parent task"]
+        ctx.to_prompt_context.return_value = (
+            "## Task\n**Title:** Child task\n"
+        )
+        ctx.relevant_files = []
+
+        with patch("codeframe.core.context_packager.ContextLoader") as MockLoader:
+            MockLoader.return_value.load.return_value = ctx
+
+            packager = TaskContextPackager(mock_workspace)
+            result = packager.build("task-1")
+
+            lineage_pos = result.prompt.index("Task Lineage")
+            gates_pos = result.prompt.index("Verification Gates")
+            assert lineage_pos < gates_pos


⚠️ Potential issue | 🟠 Major

Add the missing v2 marker for this module.

These new lineage tests cover v2 functionality, but the file still lacks a module-level pytestmark = pytest.mark.v2, so they won't follow the repo's test-selection rules.

Proposed fix

 import pytest
 from pathlib import Path
 from unittest.mock import MagicMock, patch
 
+pytestmark = pytest.mark.v2
+
 from codeframe.core.context_packager import TaskContextPackager, PackagedContext
 from codeframe.core.context import TaskContext, FileInfo
 from codeframe.core.adapters.agent_adapter import AgentContext

As per coding guidelines "`tests/**/*.py`: New v2 Python tests must be marked with `@pytest.mark.v2` decorator or `pytestmark = pytest.mark.v2`."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/core/test_context_packager.py` around lines 465 - 542, The module tests/core/test_context_packager.py contains new v2 lineage tests (e.g. TestLineageContext and its methods like test_context_packager_includes_lineage) but lacks the module-level v2 marker; add an import for pytest and set pytestmark = pytest.mark.v2 at the top of the file so all tests in this module are marked v2 per the repo testing rules.

claude · 2026-03-15T04:15:26Z

PR Review: feat(tasks): richer task generation with recursive decomposition

This is a substantial and well-structured PR. The module separation is clean (task_tree.py is properly headless), test coverage is solid (51 tests), and the schema migration approach is consistent with the existing pattern. A few critical issues to address before merging.

Critical Issues

1. generate_task_tree receives the full PRD as description, not a task title

In cli/app.py, the recursive path calls generate_task_tree(provider, prd_record.content, lineage=[], depth=0, max_depth=max_depth). The function calls classify_task on this string, asking the LLM 'is this atomic or composite?' - but it is a full PRD, not a task description. The LLM will almost certainly classify it as composite and decompose_task will try to break the entire PRD into 2-7 subtasks, which is not the intended behavior. The correct flow: first extract top-level tasks from the PRD (as _generate_tasks_with_llm does today), then recursively decompose each task. Without this fix, the recursive path produces a fundamentally different structure than what the feature description implies.
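
The two-phase flow suggested here (extract top-level tasks first, then decompose each one) might look roughly like the sketch below. All names are hypothetical: `generate_forest` and `build_tree` are not the PR's functions, and the three callables stand in for the LLM-backed extraction, classification, and decomposition steps.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    title: str
    lineage: list[str]
    children: list["Node"] = field(default_factory=list)


def build_tree(title: str, lineage: list[str], depth: int, max_depth: int,
               is_atomic: Callable[[str], bool],
               split: Callable[[str], list[str]]) -> Node:
    """Recursively decompose one task title until it is atomic or the depth limit is hit."""
    node = Node(title, lineage)
    if depth >= max_depth or is_atomic(title):
        return node
    for sub in split(title):
        node.children.append(
            build_tree(sub, lineage + [title], depth + 1, max_depth, is_atomic, split)
        )
    return node


def generate_forest(prd_text: str,
                    extract_top_level: Callable[[str], list[str]],
                    is_atomic: Callable[[str], bool],
                    split: Callable[[str], list[str]],
                    max_depth: int = 3) -> list[Node]:
    # Phase 1: turn the PRD into top-level task titles.
    # Phase 2: classify and decompose each title, never the raw PRD text.
    return [build_tree(t, [], 0, max_depth, is_atomic, split)
            for t in extract_top_level(prd_text)]


# Stub callables standing in for LLM calls.
forest = generate_forest(
    "long PRD text",
    extract_top_level=lambda prd: ["Auth", "Docs"],
    is_atomic=lambda title: title != "Auth",
    split=lambda title: ["Login", "Logout"],
    max_depth=2,
)
```

With these stubs, only "Auth" is decomposed, and its children carry `["Auth"]` as lineage; the PRD text itself is never passed to the classifier.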

2. propagate_status is defined but never called

The PR description says 'Children done, parent auto-completes' but propagate_status is not wired into runtime.py or state_machine.py. Status propagation will never fire during normal agent execution. Either wire it in (e.g., call it from complete_run() / fail_run() after update_status) or document clearly that it is a utility for future integration.
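
For reference, the propagation rule from the PR description (all children done completes the parent; any failed child fails it) can be sketched on plain dicts. This is not the PR's `propagate_status`; `propagate_up`, the status strings, and the `parent_of` mapping are assumptions made for illustration.

```python
def propagate_up(statuses: dict[str, str], parent_of: dict[str, str], changed_id: str) -> None:
    """Walk up from `changed_id`, updating ancestor statuses in place.

    A parent becomes "FAILED" if any child failed and "DONE" if all children
    are done; otherwise it is left alone and propagation stops.
    """
    children_of: dict[str, list[str]] = {}
    for child, parent in parent_of.items():
        children_of.setdefault(parent, []).append(child)

    current = parent_of.get(changed_id)
    while current is not None:
        child_states = [statuses[c] for c in children_of[current]]
        if "FAILED" in child_states:
            new_state = "FAILED"
        elif all(state == "DONE" for state in child_states):
            new_state = "DONE"
        else:
            break  # siblings still pending: nothing to propagate
        if statuses[current] == new_state:
            break  # already up to date, so ancestors are too
        statuses[current] = new_state
        current = parent_of.get(current)


statuses = {"root": "IN_PROGRESS", "auth": "IN_PROGRESS", "login": "DONE", "logout": "IN_PROGRESS"}
parent_of = {"auth": "root", "login": "auth", "logout": "auth"}

propagate_up(statuses, parent_of, "login")   # one sibling still open: no change
statuses["logout"] = "DONE"
propagate_up(statuses, parent_of, "logout")  # auth and then root become DONE
```

A hook of this shape would need to run after every status update, which is the wiring this item says is missing.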

3. LLM call explosion with no user warning

With max_depth=3 and composite tasks generating up to 7 subtasks, worst-case is approximately 800 LLM calls for a single 'cf tasks generate --recursive'. The user gets no cost estimate or call-count warning. At max_depth=5 (the CLI limit), this becomes catastrophically expensive. At minimum, print an estimated call-count warning or require an explicit --confirm flag.
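
The figure quoted above is consistent with counting two LLM calls (one classification, one decomposition) per node of a full tree with branching factor 7: 1 + 7 + 49 + 343 = 400 nodes at max_depth=3. That reading is an assumption about how the estimate was derived; the real count depends on how the engine treats nodes at the depth limit. A small helper (hypothetical, not part of the PR) makes the growth visible:

```python
def worst_case_llm_calls(max_depth: int, branching: int = 7, calls_per_node: int = 2) -> int:
    """Upper bound on LLM calls for a full tree of the given depth.

    Assumes every node is composite, always yields `branching` subtasks,
    and costs `calls_per_node` calls (classify + decompose).
    """
    nodes = sum(branching ** depth for depth in range(max_depth + 1))
    return nodes * calls_per_node


for depth in (1, 2, 3, 5):
    print(f"max_depth={depth}: up to {worst_case_llm_calls(depth)} calls")
```

Under these assumptions the bound is 800 calls at max_depth=3 and 39,216 at max_depth=5, which is the kind of number a pre-run warning could print.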

Moderate Issues

4. _row_to_task positional indexing is fragile

The new fields use positional row[14], row[15], row[16], row[17] with len(row) > N guards. This already existed for github_issue_number at row[13] and is a known maintenance risk - any query path reusing _row_to_task without the new columns selected will silently return wrong data. Consider using sqlite3.Row factory with named column access (row['parent_id']) to make this robust across future refactors.
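
A minimal illustration of the named-access approach using the standard library's `sqlite3.Row`. The table below is a cut-down stand-in for the real tasks schema, and the dict built from the row is only a placeholder for `_row_to_task`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows now support access by column name
conn.execute(
    "CREATE TABLE tasks (id TEXT, title TEXT, parent_id TEXT, is_leaf INTEGER DEFAULT 1)"
)
conn.execute("INSERT INTO tasks (id, title, parent_id) VALUES ('t2', 'Child', 't1')")

# Column order in the SELECT no longer matters to the mapping code.
row = conn.execute("SELECT is_leaf, parent_id, title, id FROM tasks").fetchone()
task = {
    "id": row["id"],
    "title": row["title"],
    "parent_id": row["parent_id"],
    "is_leaf": bool(row["is_leaf"]),
}

# A query that omits a column can be detected explicitly instead of
# silently reading the wrong position.
narrow = conn.execute("SELECT id, title FROM tasks").fetchone()
has_parent_column = "parent_id" in narrow.keys()
```

Accessing a column that the query did not select raises an error rather than returning a neighbouring column's value, which is the failure mode positional indexing hides.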

5. Hierarchical ID sorting uses string comparison

children_map[pid].sort(key=lambda t: t.hierarchical_id or t.title) - String sort puts '1.10' before '1.2'. Fix: key=lambda t: [int(p) for p in t.hierarchical_id.split('.')] if t.hierarchical_id else [999]
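
The ordering difference is easy to reproduce. In this sketch `hierarchical_key` is a hypothetical helper that assumes purely numeric, dot-separated IDs and sorts tasks without an ID last.

```python
from typing import Optional


def hierarchical_key(hierarchical_id: Optional[str]) -> tuple[int, list[int]]:
    """Sort key comparing dotted IDs part by part as integers; missing IDs go last."""
    if not hierarchical_id:
        return (1, [])
    return (0, [int(part) for part in hierarchical_id.split(".")])


ids = ["1.10", "1.2", None, "1.1", "2"]

string_order = sorted(i for i in ids if i)          # lexicographic comparison
numeric_order = sorted(ids, key=hierarchical_key)   # numeric, per component

print(string_order)   # ['1.1', '1.10', '1.2', '2']
print(numeric_order)  # ['1.1', '1.2', '1.10', '2', None]
```

The leading 0/1 in the key keeps tasks without an ID from ever being compared against integer lists directly.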

6. files_to_modify is appended to description rather than stored

The LLM is asked to provide files_to_modify but the values are concatenated into the description string. This pollutes the description field and makes the data unusable programmatically. Either add a dedicated model field, or drop it from the prompt until it can be stored cleanly.

Minor Observations

test_display_empty_workspace has a dead-code 'or' branch: display_task_tree always returns 'No tasks found.' so the empty-string check is unreachable.
The test helper _update_tree_fields writes raw SQL inside a test file. Acceptable short-term, but consider moving to a tasks.py internal helper if more tests need it.
classify_task uses Purpose.PLANNING (Sonnet). Since it needs only a single-word answer, Purpose.GENERATION (Haiku) would be faster and cheaper.

What Is Working Well

Clean headless module design: task_tree.py has no FastAPI imports, consistent with architecture rules.
Backward-compatible schema migration follows the existing _ensure_schema_upgrades pattern correctly.
Rich metadata fields (complexity_score, estimated_hours, uncertainty_level, depends_on_titles) in the standard LLM prompt are immediately useful and non-breaking.
_parse_subtasks handles both raw JSON and markdown-wrapped JSON - good defensive parsing.
51 tests with solid coverage of edge cases (clamping, fallbacks, multi-level propagation).

Two blockers before merge: (1) the PRD-as-description design bug in generate_task_tree, and (2) propagate_status not being wired in. The other items can be tracked as follow-up issues.

Test User added 4 commits March 14, 2026 20:14

feat(core): add tree-structure fields to Task model (#420 step 1)

a6a8c2a

Add parent_id, lineage, is_leaf, and hierarchical_id fields to support hierarchical task decomposition. Includes DB schema, migrations, and CRUD operations with full backward compatibility.

fix: remove unused PrdRecord import in test_rich_task_generation

1250ded

coderabbitai Bot reviewed Mar 15, 2026


frankbria merged commit 1a15a27 into main Mar 15, 2026
28 of 29 checks passed

Repository owner deleted a comment from claude Bot Mar 15, 2026

coderabbitai Bot mentioned this pull request Mar 24, 2026

feat(web-ui): task → PRD requirement traceability badges (#468) #490

Merged

5 tasks

frankbria deleted the feature/issue-420-richer-task-generation branch March 24, 2026 23:28

frankbria merged 5 commits into main from feature/issue-420-richer-task-generation


-    for pid in children_map:
-        children_map[pid].sort(
-            key=lambda t: t.hierarchical_id or t.title
-        )
+def _hierarchical_sort_key(task) -> tuple:
+    hid = task.hierarchical_id
+    if hid:
+        parts = []
+        for p in str(hid).split("."):
+            parts.append(int(p) if p.isdigit() else p)
+        return (0, parts)
+    return (1, task.title.lower())
+# Sort children by hierarchical_id if available, else by title
+for pid in children_map:
+    children_map[pid].sort(key=_hierarchical_sort_key)
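
One caveat for a key shaped like the one above: in Python 3, comparing an int with a str raises TypeError, so a parts list that mixes both can fail when IDs such as "1.2" and "1.a" share a parent. A hedged variant that tags each part so only like types are compared (`safe_hierarchical_key` is a hypothetical name, and sorting non-numeric parts after numeric ones is an arbitrary choice):

```python
def safe_hierarchical_key(hierarchical_id, title):
    """Sort key that tolerates non-numeric ID parts; tasks without an ID sort last by title."""
    if not hierarchical_id:
        return (1, [], title.lower())
    parts = [
        (0, int(part), "") if part.isdecimal() else (1, 0, part)
        for part in str(hierarchical_id).split(".")
    ]
    return (0, parts, "")


tasks = [("1.a", "x"), ("1.2", "y"), (None, "Zed"), ("1.10", "w")]
ordered = sorted(tasks, key=lambda t: safe_hierarchical_key(*t))
print([hid for hid, _ in ordered])  # ['1.2', '1.10', '1.a', None]
```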
